AMENDMENTS TO THE SPECIFICATION:
Please amend the paragraph beginning at line 7 on page 1, as follows: 

The following seven Applications, including the present Application, are related: 

1. U.S. Patent Application No. 10/671,887, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING 
COMPOSITE BLOCKING BASED ON L1 CACHE SIZE", having IBM Docket
YOR920030010US1; 

2. U.S. Patent Application No. 10/671,933, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING A 
HYBRID FULL PACKED STORAGE FORMAT", having IBM Docket 
YOR920030168US1; 

3. U.S. Patent Application No. 10/671,888, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING 
REGISTER BLOCK DATA FORMAT", having IBM Docket YOR920030169US1; 

4. U.S. Patent Application No. 10/671,889, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING LEVEL 
3 PREFETCHING FOR KERNEL ROUTINES", having IBM Docket YOR920030170US1; 

5. U.S. Patent Application No. 10/671,937, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING
PRELOADING OF FLOATING POINT REGISTERS", having IBM Docket 
YOR920030171US1; 

6. U.S. Patent Application No. 10/671,935, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING A 
SELECTABLE ONE OF SIX POSSIBLE LEVEL 3 L1 KERNEL ROUTINES", having IBM
Docket YOR920030330US1; and 

7. U.S. Patent Application No. 10/671,934, filed on
September 29, 2003, to Gustavson et al., entitled "METHOD AND STRUCTURE FOR
PRODUCING HIGH PERFORMANCE LINEAR ALGEBRA ROUTINES USING 
STREAMING", having IBM Docket YOR920030331US1, all assigned to the present 
assignee, and all incorporated herein by reference. 

Please amend the paragraph beginning at line 6 on page 9, as follows: 

Therefore, a key idea of the present invention is to store each of these three 
submatrices contiguously in some representation (permutation) that has an optimal advantage 
for the L1 cache-FPU register interface of a particular architecture. It is noted that register
sets such as the FPU registers are referred to herein as the "L0" cache.

Please amend the paragraph beginning at line 5 on page 13, as follows: 

Details of the FPU are not so important for an understanding of the present invention,
since a number of configurations are well known in the art. Figure 3 shows an exemplary 
typical CPU 211 that includes at least one FPU 302. The FPU function of CPU 211 controls
the FMAs (floating-point multiply/add), and at least one load/store unit (LSU) 301, which 
loads/stores data to/from memory device 304 into the floating point registers (FReg's) 303.
The register set 303 in a co-processor unit such as the FPU can also be considered as the "L0"
cache. 